Automatic Optimization of Dialogue Management

Authors

  • Diane J. Litman
  • Michael Kearns
  • Satinder P. Singh
  • Marilyn A. Walker
Abstract

Designing the dialogue strategy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing a dialogue strategy that addresses the technical challenges in applying reinforcement learning to a working dialogue system with human users. We then show that our approach measurably improves performance in an experimental system.

1 Introduction

Recent advances in spoken language understanding have made it possible to develop dialogue systems for many applications. The role of the dialogue manager in such systems is to interact in a natural way to help the user complete the tasks that the system is designed to support. Typically, an expert designs a dialogue manager by hand, and has to make many nontrivial design choices that can seriously impact system performance. This paper applies reinforcement learning (RL) to automatically learn design choices that optimize system performance for a chosen performance measure (Levin et al., 2000; Walker et al., 1998). Consider the spoken dialogue system named NJFun, which we built to help users find fun places to go in New Jersey. A sample dialogue with NJFun is shown in Figure 1, with system utterances labeled Si and user utterances labeled Ui. When NJFun greets the user, it can provide an open greeting such as utterance S1, expecting that the user will take the initiative. Alternatively, it can provide a directive greeting in which the system takes the initiative, as in "Welcome to NJFun. Please say an activity name or say 'list activities' for a list of activities I know about." After the user's response U1, NJFun must decide whether it should explicitly confirm its understanding, as in utterances S2 and S3. NJFun can also simply continue on with the dialogue, as when it does not explicitly confirm that the user wants to find out about wineries.

S1: Welcome to NJFun. How may I help you?
U1: I'd like to find um winetasting in Lambertville in the morning.
    (ASR output: I'd like to find out wineries the in the Lambertville in the morning.)
S2: Did you say you are interested in Lambertville?
U2: Yes.
S3: Did you say you want to go in the morning?
U3: Yes.
S4: I found a winery near Lambertville that is open in the morning. It is [...] Please give me feedback by saying 'good', 'so-so', or 'bad'.
U4: Good.

Figure 1: An example dialogue with NJFun.

In NJFun, as shown in more detail below, decisions about initiative and confirmation strategies alone result in a search space of 2^42 potential global dialogue strategies. Furthermore, the performance of a dialogue strategy depends on many other factors, such as the user population, the robustness of the automatic speech recognizer (ASR), and task difficulty (Kamm et al., 1998; Danieli and Gerbino, 1995). In the main, previous research has treated the specification of the dialogue management strategy as an iterative design problem: several versions of a system are created, dialogue corpora are collected with human users interacting with different versions of the system, a number of evaluation metrics are collected for each dialogue, and the different versions are statistically compared (Danieli and Gerbino, 1995; Sanderman et al., 1998). Due to the costs of experimentation, only a few global strategies are typically explored in any one experiment. However, recent work has suggested that dialogue strategy can be designed using the formalism of Markov decision processes (MDPs) and the algorithms of RL (Biermann and Long, 1996; Levin et al., 2000; Walker et al., 1998; Singh et al., 1999). More specifically, the MDP formalism suggests a method for optimizing dialogue strategies from sample dialogue data.
The main advantage of this approach is the potential for computing an optimal dialogue strategy within a much larger search space, using a relatively small number of training dialogues. This paper presents an application of RL to the...
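
The idea of optimizing a dialogue strategy from sample dialogue data can be illustrated with a minimal sketch. It is not NJFun's implementation: the state names (`greet`, `confirm`), the action names, the reward values, and the helper functions `estimate_mdp` and `optimize_strategy` are all assumptions made up for this example. The sketch estimates an MDP from logged dialogues by counting transitions and averaging rewards, then runs value iteration on the estimated model to pick one action per state.

```python
from collections import defaultdict


def estimate_mdp(dialogues):
    """Estimate transition probabilities and mean rewards from logged dialogues.

    Each dialogue is a list of (state, action, reward, next_state) tuples;
    next_state is None when the dialogue ends.
    """
    counts = defaultdict(lambda: defaultdict(int))
    reward_sums = defaultdict(float)
    for dialogue in dialogues:
        for state, action, reward, next_state in dialogue:
            counts[(state, action)][next_state] += 1
            reward_sums[(state, action)] += reward
    model = {}
    for sa, next_counts in counts.items():
        total = sum(next_counts.values())
        probs = {s2: n / total for s2, n in next_counts.items()}
        model[sa] = (probs, reward_sums[sa] / total)
    return model


def optimize_strategy(model, gamma=0.95, sweeps=100):
    """Run value iteration on the estimated model; return Q-values and a policy."""
    q = {sa: 0.0 for sa in model}
    for _ in range(sweeps):
        # Value of each state = best Q-value among the actions tried in it.
        best = {}
        for (state, _action), value in q.items():
            best[state] = max(value, best.get(state, float("-inf")))
        # Terminal (None) and unseen next states contribute a value of 0.
        q = {
            sa: mean_reward + gamma * sum(
                p * best.get(s2, 0.0) for s2, p in probs.items()
            )
            for sa, (probs, mean_reward) in model.items()
        }
    policy = {}
    for (state, action), value in q.items():
        if state not in policy or value > q[(state, policy[state])]:
            policy[state] = action
    return q, policy


# Hypothetical toy corpus: four logged dialogues with made-up rewards.
dialogues = [
    [("greet", "open", 0.0, "confirm"), ("confirm", "explicit", 1.0, None)],
    [("greet", "open", 0.0, "confirm"), ("confirm", "none", 0.0, None)],
    [("greet", "directive", 0.0, "confirm"), ("confirm", "explicit", 1.0, None)],
    [("greet", "directive", 0.0, None)],
]
model = estimate_mdp(dialogues)
q, policy = optimize_strategy(model)
print(policy)
```

On this toy corpus the learned policy selects the open greeting and explicit confirmation, simply because those choices led to higher average reward in the invented data. With so few dialogues the estimates are noisy, which is why the amount of exploratory training data matters in practice.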

Similar resources

Learning what to say and how to say it: Joint optimisation of spoken dialogue management and natural language generation

This paper argues that the problems of dialogue management (DM) and Natural Language Generation (NLG) in dialogue systems are closely related and can be fruitfully treated statistically, in a joint optimisation framework such as that provided by Reinforcement Learning (RL). We first review recent results and methods in automatic learning of dialogue management strategies for spoken and multimod...


Spoken Dialogue Management Using Hierarchical Reinforcement Learning and Dialogue Simulation

Speech-based human-computer interaction faces several difficult challenges in order to be more widely accepted. One of the challenges in spoken dialogue management is to control the dialogue flow (dialogue strategy) in an efficient and natural way. Dialogue strategies designed by humans are prone to errors, labour-intensive and non-portable, making automatic design an attractive alternative. Pr...


Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning

Speech-enabled interfaces are nowadays becoming ubiquitous. The most advanced ones rely on probabilistic pattern matching systems and especially on automatic speech recognition systems. Because of their statistical nature, performances of such systems never reach one hundred percent of correct recognition results. Performances are linked to environmental noise and to intra- and inter-speaker vari...


Using Markov decision process for learning dialogue strategies

In this paper we introduce a stochastic model for dialogue systems based on Markov decision process. Within this framework we show that the problem of dialogue strategy design can be stated as an optimization problem, and solved by a variety of methods, including the reinforcement learning approach. The advantages of this new paradigm include objective evaluation of dialogue systems and their a...


Automatic Optimization of Dialogue Management

Designing the dialogue strategy of a spoken dialogue system involves many nontrivial choices. This paper presents a reinforcement learning approach for automatically optimizing dialogue strategy. We first present a practical methodology that addresses the technical challenges in applying reinforcement learning to a working dialogue system with human users. We then demonstrate how we have used t...


Hierarchical Reinforcement Learning of Dialogue Policies in a development environment for dialogue systems: REALL-DUDE

We demonstrate the REALL-DUDE system, which is a combination of REALL, an environment for Hierarchical Reinforcement Learning, and DUDE, a development environment for “Information State Update” dialogue systems (Lemon and Liu, 2006) which allows non-expert developers to produce complete spoken dialogue systems based only on a Business Process Model (BPM) and SQL database describing their appli...




Publication date: 2000